Standard ML employs an opaque (or generative) interpretation of datatype
specifications, in which every datatype specification provides a new, abstract
type that is different from any other type, including other identically
specified datatypes. An alternative interpretation is the transparent one, in
which a datatype specification exposes the underlying recursive type
implementation of the datatype.

It is commonly believed that the transparent interpretation is strictly more
permissive than the opaque interpretation, that is, that every program typable
under the opaque discipline is also typable under the transparent discipline.
The purpose of this note is to illustrate that this common belief is incorrect
(in the usual equational theory for types), and to discuss some of the
implications of that fact.


1  An  Example 

To see the issue involved, consider the signatures SIG1 and SIG2:

signature SIG1 =
sig
  datatype u = C of u * u | D of int
  type t = u * u
end

signature SIG2 =
sig
  type t
  datatype u = C of t | D of int
end

Is SIG1 a subsignature of SIG2? In an opaque interpretation (and in Standard
ML [3]) the answer is yes. But in a transparent interpretation the answer is
no. To show why this is so, we give the opaque and transparent interpretations
of SIG1 and SIG2 in a type theory without datatypes but with sums and
iso-recursive types (recursive types in which folding and unfolding must be
mediated by an explicit isomorphism).
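The claim for Standard ML can be checked with a small sketch, assuming a compiler that follows the Definition [3]; it has not been verified here against any particular implementation. The functor Weaken, the structures S1 and S2, and the function size are names introduced for illustration only. The functor typechecks only if every structure matching SIG1 also matches SIG2.

```sml
signature SIG1 =
sig
  datatype u = C of u * u | D of int
  type t = u * u
end

signature SIG2 =
sig
  type t
  datatype u = C of t | D of int
end

(* Hypothetical witness: accepted only if SIG1 is a subsignature of SIG2. *)
functor Weaken (X : SIG1) : SIG2 = X

structure S1 : SIG1 =
struct
  datatype u = C of u * u | D of int
  type t = u * u
end

structure S2 = Weaken (S1)

(* Through S2, the constructor C takes a value of type S2.t,
   which is realized as S1.u * S1.u. *)
fun size (S2.D _) = 1
  | size (S2.C (l, r)) = size l + size r

val tree : S2.u = S2.C (S2.D 1, S2.C (S2.D 2, S2.D 3))
val leaves = size tree
```

Because the ascriptions are transparent, S2.t and S1.t remain the same type, so a pair of S1.u values can be passed directly to S2.C.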

In an opaque interpretation, a datatype specification provides an abstract
type along with introduction and elimination functions for that type [2]:

signature SIG1_opaque =
sig
  type u
  type t = u * u
  val u_in : (u * u + int) -> u
  val u_out : u -> (u * u + int)
end

signature SIG2_opaque =
sig
  type t
  type u
  val u_in : (t + int) -> u
  val u_out : u -> (t + int)
end

In this interpretation, SIG1 matches SIG2 because (u * u + int) -> u is
equal to (t + int) -> u under the assumption that t = u * u, and similarly
u -> (u * u + int) is equal to u -> (t + int).
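The opaque interpretation can itself be written in Standard ML as a rough sketch. Since the target type theory has binary sums but Standard ML does not, the sketch assumes a helper datatype sum with constructors INL and INR standing in for +. The names Impl, Fold, and Weaken are likewise illustrative and do not come from the source.

```sml
(* Assumed stand-in for the binary sum type written "+" in the text. *)
datatype ('a, 'b) sum = INL of 'a | INR of 'b

signature SIG1_opaque =
sig
  type u
  type t = u * u
  val u_in  : (u * u, int) sum -> u
  val u_out : u -> (u * u, int) sum
end

signature SIG2_opaque =
sig
  type t
  type u
  val u_in  : (t, int) sum -> u
  val u_out : u -> (t, int) sum
end

(* One possible implementation; u is abstract outside the structure. *)
structure Impl :> SIG1_opaque =
struct
  datatype u = Fold of (u * u, int) sum
  type t = u * u
  val u_in = Fold
  fun u_out (Fold s) = s
end

(* Matching succeeds because the value specifications agree once t = u * u. *)
functor Weaken (X : SIG1_opaque) : SIG2_opaque = X
structure Weak = Weaken (Impl)

val leaf = Weak.u_in (INR 3)
val node = Weak.u_in (INL (leaf, leaf))
```

Here the only interplay between t and u is in the types of u_in and u_out, which is why matching reduces to comparing two function types.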

However, in a transparent interpretation, a datatype specification exposes
the underlying recursive type:

signature SIG1_transparent =
sig
  type u = μα. α * α + int
  type t = u * u
end

signature SIG2_transparent =
sig
  type t
  type u = μα. t + int
end

In this interpretation, SIG1 does not match SIG2 because u’s abbreviation
in SIG1 is not equal to its abbreviation in SIG2. Invoking t = u * u, the
latter may be shown equal to

$$\mu\alpha.\ (\mu\alpha.\ \alpha * \alpha + \mathrm{int}) * (\mu\alpha.\ \alpha * \alpha + \mathrm{int}) + \mathrm{int}$$

which, in the usual equational theory for types, is not the same as SIG1’s
abbreviation:

$$\mu\alpha.\ \alpha * \alpha + \mathrm{int}$$

What is happening here is that, in order for SIG1 to match SIG2, the datatype
specification for u in SIG2 must be able to “capture” a definition given to t,
even when t is defined in terms of u. This is possible in the opaque setting
because t and u are independent abstract types, and any interplay between
them is deferred to value fields. In a transparent setting, the necessary
capture is impossible; u and $\alpha$ are different variables and the recursive
binding of $\alpha$ cannot capture any occurrences of u.

2  Implications

This example illustrates that under the usual equality rules for iso-recursive
types, Standard ML is incompatible with a transparent interpretation. However,
in an implementation it is unacceptable to incur the cost of a function call
for every datatype construction and pattern match, so the transparent
interpretation is required. In a type-preserving compiler, one may adopt
internally a new interpretation of the language, but only when that internal
interpretation is at least as permissive as the external one, which we have
shown is not the case here. This poses no problem to those compilers that
erase types before compiling, but how can Standard ML be implemented in a
type-preserving manner?

Shao, in the FLINT compiler [5], addresses this problem with what we call
“Shao’s equation” (where we write $E[E'/X]$ to mean the capture-avoiding
substitution of $E'$ for $X$ in $E$):

$$\mu\alpha.\tau \equiv \mu\alpha.(\tau[\mu\alpha.\tau/\alpha])$$

Shao’s equation addresses the problem with the example above by rendering
the two abbreviations equal. More generally, for any equational theory one may
prove that if the transparent interpretation accepts every program typable
under the opaque interpretation, then that theory must include all instances
of Shao’s equation. Thus, we argue that Shao’s equation is essential to
efficient, type-preserving compilation of languages with opaque datatypes,
such as Standard ML.
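The effect of Shao’s equation on the example can be sketched as a small syntactic check. This is not Shao’s algorithm [4]; it only applies the equation once, left to right, at the outermost binder. The representation is an assumption made here: types use de Bruijn indices (Var 0 is the nearest enclosing binder), so structural equality coincides with equality up to renaming of bound variables. The names ty, shift, replace, and shao are illustrative.

```sml
(* Types with de Bruijn indices: Mu binds index 0 in its body. *)
datatype ty =
    Var of int
  | Int
  | Prod of ty * ty
  | Sum of ty * ty
  | Mu of ty

(* Increment every free index >= c, for moving a type under one more binder. *)
fun shift c (Var k) = if k >= c then Var (k + 1) else Var k
  | shift _ Int = Int
  | shift c (Prod (a, b)) = Prod (shift c a, shift c b)
  | shift c (Sum (a, b)) = Sum (shift c a, shift c b)
  | shift c (Mu b) = Mu (shift (c + 1) b)

(* Replace index j by s, leaving all other indices unchanged. *)
fun replace j s (Var k) = if k = j then s else Var k
  | replace _ _ Int = Int
  | replace j s (Prod (a, b)) = Prod (replace j s a, replace j s b)
  | replace j s (Sum (a, b)) = Sum (replace j s a, replace j s b)
  | replace j s (Mu b) = Mu (replace (j + 1) (shift 0 s) b)

(* One left-to-right use of Shao's equation: mu a.T  ==>  mu a.(T[mu a.T/a]). *)
fun shao (Mu body) = Mu (replace 0 (shift 0 (Mu body)) body)
  | shao t = t

(* SIG1's abbreviation: mu a. a * a + int *)
val u1 = Mu (Sum (Prod (Var 0, Var 0), Int))

(* SIG2's abbreviation after invoking t = u * u *)
val u2 = Mu (Sum (Prod (u1, u1), Int))

val syntacticallyEqual = (u1 = u2)       (* false *)
val equalAfterShao = (shao u1 = u2)      (* true *)
```

A full decision procedure would need to apply the equation in both directions and at inner binders; the sketch only confirms that a single instance with the body α * α + int relates the two abbreviations.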

Note that this equation falls short of the equation for equi-recursive types
(recursive types in which fold and unfold need not be performed explicitly):

$$\mu_{\mathrm{equi}}\alpha.\tau \equiv \tau[\mu_{\mathrm{equi}}\alpha.\tau/\alpha]$$

Since the right-hand side of Shao’s equation is still a recursive type (in
contrast to the right-hand side of the equi-recursive type equation) it is
possible that the type equality problem with Shao’s equation may be solved
more efficiently than the problem for equi-recursive types [1]. Indeed, Shao
claims to have an efficient algorithm for the problem [4].

Nevertheless, there is some question as to the validity of Shao’s equation. In
many semantic contexts, though certainly not all, the equation may be
justifiable. Note that terms having the left-hand side type,

$$\mu\alpha.\tau \stackrel{\mathrm{def}}{=} \tau_{\mathrm{left}}$$

and terms having the right-hand side type,

$$\mu\alpha.(\tau[\mu\alpha.\tau/\alpha]) \stackrel{\mathrm{def}}{=} \tau_{\mathrm{right}}$$

both unfold to members of the same type:

$$\tau[\mu\alpha.\tau/\alpha]$$

Thus, terms may be coerced from one type to the other by unfolding them at
one type and refolding them at the other type. For instance, if $e$ has type
$\tau_{\mathrm{left}}$, then $\mathtt{unfold}[\tau_{\mathrm{left}}]\,e$ has
type $\tau[\mu\alpha.\tau/\alpha]$, and so
$\mathtt{fold}[\tau_{\mathrm{right}}](\mathtt{unfold}[\tau_{\mathrm{left}}]\,e)$
has type $\tau_{\mathrm{right}}$.

Thus, Shao’s equation is justifiable in a semantic framework in which such
a fold-unfold operation (at different types) is the identity. (Fold-unfold at
the same type would be the identity in nearly any semantic framework.) A
particularly important case where this is true is when fold and unfold
themselves are no-ops, as is the case in most implementations.
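The unfold-then-refold coercion can be imitated in Standard ML for the body α * α + int. This is a sketch under assumptions: single-constructor datatypes left and right stand in for the two iso-recursive types, and an assumed helper datatype sum stands in for +. All names are illustrative. Whether the constructors are actually no-ops at run time depends on the compiler.

```sml
(* Assumed stand-in for the binary sum type written "+" in the text. *)
datatype ('a, 'b) sum = INL of 'a | INR of 'b

(* left  plays the role of  mu a. a * a + int *)
datatype left = FoldL of (left * left, int) sum

(* right plays the role of  mu a. (left * left + int); the bound variable is unused *)
datatype right = FoldR of (left * left, int) sum

(* Both types unfold to the same type, (left * left, int) sum. *)
fun unfoldL (FoldL s) = s
fun unfoldR (FoldR s) = s

(* Coerce by unfolding at one type and refolding at the other. *)
fun toRight (e : left) : right = FoldR (unfoldL e)
fun toLeft (e : right) : left = FoldL (unfoldR e)

val leaf = FoldL (INR 7)
val node = FoldL (INL (leaf, leaf))
val roundTrip = toLeft (toRight node)
```

In this encoding the round trip rebuilds a structurally equal value; in a setting where fold and unfold are erased, toRight and toLeft would do no work at all.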

3  Conclusions 

Opaque datatypes are purported to carry software engineering benefits, but
datatypes must be transparent, at least internally, to achieve efficient
compilation. Were the transparent discipline more permissive than the opaque
one, this would not pose a problem, but we show that this is not so.

The opaque and transparent disciplines can be reconciled only by adopting
Shao’s equation. Therefore, we argue that Shao’s equation is essential to
efficient, type-preserving compilation of any language with opaque datatypes.
This equation is not valid in every semantic context, and although it may be
permissible in many important ones, at the very least it complicates
typechecking. Thus, there are good reasons why one could prefer to reject
Shao’s equation. However, in a type-preserving compiler, if we wish not to
embrace Shao’s equation, we are left with no choice but to abandon opaque
datatypes as well.

A  Proof


Suppose an equational theory is given and suppose that the transparent
interpretation accepts every program typable under the opaque interpretation.
Let $\tau[\alpha]$ be an arbitrary type with free variable $\alpha$ and let
SIG1' and SIG2' be defined as follows:

signature SIG1' =
sig
  datatype u = C of τ[u]
  type t = τ[u]
end

signature SIG2' =
sig
  type t
  datatype u = C of t
end


In the opaque interpretation SIG1' matches SIG2', so SIG1' must match
SIG2' under the transparent interpretation as well. In SIG1' the abbreviation
for u is

$$\mu\alpha.\tau[\alpha]$$

and in SIG2' it is

$$\mu\alpha.t$$

which, invoking $t = \tau[u]$, is equal to

$$\mu\alpha.\tau[\mu\alpha.\tau[\alpha]]$$

Since these abbreviations must be equal, we conclude

$$\mu\alpha.\tau[\alpha] = \mu\alpha.\tau[\mu\alpha.\tau[\alpha]]$$